{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "无通信的多智能体就是多组单智能体而已.分别实现即可.\n",
    "\n",
    "每一组智能体有各自的actor,critic,并且不通信任何的数据."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAR8AAAEYCAYAAABlUvL1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/P9b71AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAlrElEQVR4nO3df1RU550/8PcdBkZ+ODOCMoNfQdmEBqk/YsTgRNOkcSpJaVojTROXGJq1unHRiGzdlE2ire2KxzQnra1ik5yqe5rolk3ND+OPJZhgjCMoiQYhIZqQA1UHVDIzoDLA3Of7R8ttRk3iwMAzg+/XOfcc5z7PzP3ce7hvn7k/5ipCCAEiokGmk10AEV2fGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCSFtPDZsGEDxo0bh2HDhiErKwvV1dWySiEiCaSEz//8z/+gqKgIq1atwnvvvYfJkycjOzsbra2tMsohIgkUGTeWZmVlYdq0afj9738PAFBVFcnJyVi6dCl+9rOfDXY5RCSBfrAX2NXVhZqaGhQXF2vzdDod7HY7HA7HVd/j9Xrh9Xq116qqoq2tDQkJCVAUZcBrJqJrI4RAe3s7Ro8eDZ3uq79YDXr4nDt3Dj6fDxaLxW++xWLBRx99dNX3lJSU4Be/+MVglEdEQdDc3IwxY8Z8ZZ9BD5++KC4uRlFRkfba7XYjJSUFzc3NMBqNEisjoi/yeDxITk7G8OHDv7bvoIfPyJEjERERgZaWFr/5LS0tsFqtV32PwWCAwWC4Yr7RaGT4EIWgazkcMuhnu6KiojB16lRUVFRo81RVRUVFBWw222CXQ0SSSPnaVVRUhPz8fGRmZuLWW2/Fb37zG1y4cAGPPPKIjHKISAIp4fPAAw/g7NmzWLlyJZxOJ26++Wbs2bPnioPQRDR0SbnOp788Hg9MJhPcbjeP+RCFkED2Td7bRURSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpAg4fPbv3497770Xo0ePhqIoeOWVV/zahRBYuXIlkpKSEB0dDbvdjhMnTvj1aWtrQ15eHoxGI8xmMxYsWICOjo5+rQgRhZeAw+fChQuYPHkyNmzYcNX2devWYf369di0aROqqqoQGxuL7OxsdHZ2an3y8vJQV1eH8vJy7Ny5E/v378eiRYv6vhZEFH5EPwAQO3bs0F6rqiqsVqt4+umntXkul0sYDAaxbds2IYQQ9fX1AoA4fPiw1mf37t1CURRx6tSpa1qu2+0WAITb7e5P+UQUZIHsm0E95tPY2Ain0wm73a7NM5lMyMrKgsPhAAA4HA6YzWZkZmZqfex2O3Q6Haqqqq76uV6vFx6Px28iovAW1PBxOp0AcMUz1y0Wi9bmdDqRmJjo167X6xEfH6/1uVxJSQlMJpM2JScnB7NsIpIgLM52FRcXw+12a1Nzc7Pskoion4IaPlarFQDQ0tLiN7+lpUVrs1qtaG1t9Wvv6elBW1ub1udyBoMBRqPRbyKi8BbU8ElNTYXVakVFRYU2z+PxoKqqCjabDQBgs9ngcrlQU1Oj9dm3bx9UVUVWVlYwyyGiEKYP9A0dHR04efKk9rqxsRFHjx5FfHw8UlJSUFhYiF/96ldIS0tDamoqnnrqKYwePRpz5swBAIwfPx533303Fi5ciE2bNqG7uxtLlizBgw8+iNGjRwdtxYgoxAV6Ku2tt94SAK6Y8vPzhRB/O93+1FNPCYvFIgwGg5g1a5ZoaGjw+4zz58+LefPmibi4OGE0GsUjjzwi2tvbr7kGnmonCk2B7JuKEEJIzL4+8Xg8MJlMcLvdPP5DFEIC2TfD4mwXEQ09DB8ikoLhQ0RSMHyISAqGDxFJwfAhIikCvsiQiAaeEAI9PT1QVRWfffYZurq6/NqtVitMJhP0ej10uvAcQzB8iEKEqqro6OjAsWPHcOjQIRw5cgQdHR1oaWlBT0+PX9/4+HjExsYiPT0dkyZNwsyZM5GcnAy9Pnx2aV5kSCRZd3c33n//fbzxxhtwOBxobW2Foija9FWEEBBCwGAwID09HXfddRdycnIQHx//te8dCIHsmwwfIklUVUVNTQ22bNmC6upq+Hw+AOhzaPQGUWJiIn70ox/hvvvuQ3x8fDBL/loMH6IQJoSAx+PBc889h7KyMvT09AR1lNIbQjfddBOWL1+OzMxMREREBO3zvwpvryAKUaqqoqGhAf/yL/+Cbdu2BT14gL+NnHQ6HT7++GMUFhaitLTU7wEOoYLhQzRIVFXFX/7yFyxatAiNjY3XdEynPxRFgdfrxebNm/HYY49d8SN+sjF8iAZBb/D8+te/RkdHx6AdDO5dzpEjR/Czn/0spAKI4UM0wFRVxcsvv4xf//rX6OrqknIWSlEUHD16FI8//njIBBDDh2gACSHwwQcf4JlnnpEWPL0URcGxY8fwzDPPoLu7W1odvRg+RAPo7NmzeOqpp6QHzxdVVFTgxRdfhOwT3QwfogHi8/nw+9//HqdOnQqZ4FEUBaqqYvPmzWhoaJBaC8OHaAAIIbBv3z7s2bMnZIKnl6IoaG9vxzPPPCP1FDzDh2gAdHZ24o9//OMV92SFCkVRUFNTg/3790urgeFDFGRCCFRWVqKhoSHkRj2X++///m9pox+GD1GQqaqKsrIy2WV8LUVRUFdXh9raWinLDyh8SkpKMG3aNAwfPhyJiYmYM2fOFQetOjs7UVBQgISEBMTFxSE3N/eKxyc3NTUhJycHMTExSExMxIoVK0J2eEoUqNOnT6O+vj7kRz29KioqpJz5Cih8KisrUVBQgEOHDqG8vBzd3d2YPXs2Lly4oPVZvnw5Xn/9dZSVlaGyshKnT5/G3LlztXafz4ecnBx0dXXh4MGD2Lp1K7Zs2YKVK1cGb62IJBFC4MCBA7h06ZLsUq6Joig4cOAALl68OPjL7s9d7WfPnkViYiIqKyvxrW99C263G6NGjcJLL72EH/7whwCAjz76COPHj4fD4cD06dOxe/dufO9738Pp06dhsVgAAJs2bcLjjz+Os2fPIioq6muXy7vaKVQJIfDoo4/i8OHDYTPyUVUVW7ZsweTJk/v9WYN2V7vb7QYA7TdDampq0N3dDbvdrvVJT09HSkoKHA4HAMDhcGDixIla8ABAdnY2PB4P6urqrrocr9cLj8fjNxGFIpfLhU8++SRsgqdXTU3NoC+zz+GjqioKCwsxY8YMTJgwAQDgdDoRFRUFs9ns19discDpdGp9vhg8ve29bVdTUlICk8mkTcnJyX0tm2hAXbp0SftPOZy0tLQM+nGfPodPQUEBjh8/ju3btweznqsqLi6G2+3Wpubm5gFfJlFf1NfXh93JE0VRpNTdp1+bXrJkCXbu3In9+/djzJgx2nyr1Yquri64XC6/0U9LSwusVqvWp7q62u/zes+G9fa5nMFggMFg6EupRIPq/PnzUFU1rJ4ooSgKzp07B1VVB3W5AW0hIQSWLFmCHTt2YN++fUhNTfVrnzp1KiIjI1FRUaHNa2hoQFNTE2w2GwDAZrOhtrbW77b+8vJyGI1GZGRk9GddiCiMBDTyKSgowEsvvYRXX30Vw4cP147RmEwmREdHw2QyYcGCBSgqKkJ8fDyMRiOWLl0Km82G6dOnAwBmz56NjIwMzJ8/H+vWrYPT6cSTTz6JgoICjm6IriMBhU9paSkA4M477/Sbv3nzZvz4xz8GADz77LPQ6XTIzc2F1+tFdnY2Nm7cqPWNiIjAzp07sXjxYthsNsTGxiI/Px+rV6/u35oQUVgJKHyu5Wj4sGHDsGHDBmzYsOFL+4wdOxa7du0KZNFEYSEpKWnQnhQRLEIIKXWHz1ExojBw4403hmX4pKWlDfrTThk+REEUGxuLhIQE2WUERAiBlJSUQV8uw4coiIxGI2644QbpP1EaCJ1OF5RbKwJe7qAvkWiIu+uuuwb9mpm+6h313HjjjYO+bIYPURApioLbbrsNcXFxsku5JkIIfOtb35JymQvDhyjILBYLJk6cGBZfvXQ6HWbNmiXlRliGD1GQKYqCBx98MOTvbBdCYMqUKRg/fryU5TN8iIJMURRMnz4dkyZNCunRj6IoyM/Pv6bf0BoIDB+iARAVFYWFCxeG7C1Dqqri9ttvR1ZWlrQaGD5EA6B39DN37tyQG/0IITBy5EgUFhZKG/UADB+iAaPT6fCTn/wEaWlpIRNAQgjo9Xr867/+K8aOHSu1FoYP0QAym81YvXo1YmJiQiaAvve97+G+++6TfkCc4UM0gBRFwTe+8Q2sWrVKegAJIXD77bdj2bJlIXH/GcOHaIApigK73S41gHqDZ/Xq1TCZTIO+/KsZ3NtYia5TvQGkKAqefvppnD17dlC+9vQe47nzzjvxxBNPhEzwABz5EA0aRVEwa9YsvPDCC7j55pshhBjQUZAQAkajEcuWLcOaNWtCKngAhg/RoFIUBWPGjMHGjRuxdOlSxMTEBP0mVCEEVFVFZmYmSktL8c///M+D/ls916JfTyyVhU8spaFACIETJ07gT3/6E958801cvHgROp2uz1/HekdSN9xwA+bPnw+73Y7Y2NggV/3VAtk3GT5EkqmqipMnT+L//u//8Pbbb6OxsRE+n++agqh31GQ2mzFlyhTY7XbccccdiImJkXIqneFDFIaEEOjq6sLJkydx5MgROBwOtLe3o7GxEV1dXX59k5KSYDabMWHCBEyaNAmZmZkYOXKk9Gt3GD5EYa53t1RVFa2trVc8TXTEiBHaVyrZgfNFgeyboXcUioi0QImIiEBSUpLkagZGQGe7SktLMWnSJBiNRhiNRthsNuzevVtr7+zsREFBARISEhAXF4fc3FztUci9mpqakJOTg5iYGCQmJmLFihVh92xrIuq/gMJnzJgxWLt2LWpqanDkyBHcdddd+MEPfoC6ujoAwPLly/H666+jrKwMlZWVOH36NObOnau93+fzIScnB11dXTh48CC2bt2KLVu2YOXKlcFdKyIKfaKfRowYIV544QXhcrlEZGSkKCsr09o+/PBDAUA4HA4hhBC7du0SOp1OOJ1OrU9paakwGo3C6/Ve8zLdbrcAINxud3/LJ6IgCmTf7PNFhj6fD9u3b8eFCxdgs9lQU1OD7u5u2O12rU96ejpSUlLgcDgAAA6HAxMnToTFYtH6ZGdnw+PxaKOnq/F6vfB4PH4TEYW3gMOntrYWcXFxMBgMePTRR7Fjxw5kZGTA6XQiKioKZrPZr7/FYoHT6QQAOJ1Ov+Dpbe9t+zIlJSUwmUzalJycHGjZRBRiAg6fm266CUePHkVVVRUWL16M/Px81NfXD0RtmuLiYrjdbm1qbm4e0OUR0cAL+FR7VFSU9oCxqVOn4vDhw/jtb3+LBx54AF1dXXC5XH6jn5aWFlitVgCA1WpFdXW13+f1ng3r7XM1BoMhZH8Ll4j6pt83lqqqCq/Xi6lTpyIyMhIVFRVaW0NDA5qammCz2QAANpsNtbW1aG1t1fqUl5fDaDQiIyOjv6UQURgJaORTXFyMe+65BykpKWhvb8dLL72Et99+G3v37oXJZMKCBQtQVFSE+Ph4GI1GLF26FDabDdOnTwcAzJ49GxkZGZg/fz7WrVsHp9OJJ598EgUFBRzZEF1nAgqf1tZWPPzwwzhz5gxMJhMmTZqEvXv34jvf+Q4A4Nlnn4VOp0Nubi68Xi+ys7OxceNG7f0RERHYuXMnFi9eDJvNhtjYWOTn52P16tXBXSsiCnm8t4uIgiaQfZM/JkZEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSYkj/hrMQAj09PVBVFYqiIDIyMqR+bJvoejYkw0cIgQsXL+CtQ2/hL0f/gvaudkTro/H9id/H7NtmwzjcyBAikmzI3V4hhEBjcyN++ZdfoimmCRHmCCiKAiEEfO0+WNwWPHHvE8i4MYMBRBRk1/XtFefbzmPlKytxKukU9CP0WsAoigK9UY9z/+8cfr775/jr6b9KrpTo+jakwkcIgVcrX8UZ4xkouquPahSdgs/jP8ef3/qz9qhZIhp8Qyp8Pv/8c+z8ZCciYiK+sp8uSocKZwXOOM8MUmVEdLkhFT7t7e1ww/21x3IURcFF/UW0udoGqTIiutyQCh9Fp0DBtR1EVoQCnTKkVp8orAypvS/JmoRvRn8TwvfVJ/CEKpCKVKSOSx2kyojockMqfPR6PebZ5kE9/9UHkn0uH350y48QPSx6kCojossNqfBRFAWZkzJxX9J98Ll9uPwSJiEEfB0+zDbNxp1Zd/I6HyKJhtwVzpGRkVh07yJY9lnw8vGX0WJsASIB9ADxrnjMuWkOfvidH2LYsGGySyW6rg25K5x7CSHQ3t6Odw6/A/cFN2KHxeL2abdjhHkERzxEAySQK5yH3Minl6IoMBqNyJmVI7sUIrqKfh3zWbt2LRRFQWFhoTavs7MTBQUFSEhIQFxcHHJzc7VHIvdqampCTk4OYmJikJiYiBUrVqCnp6c/pRBRmOlz+Bw+fBh/+MMfMGnSJL/5y5cvx+uvv46ysjJUVlbi9OnTmDt3rtbu8/mQk5ODrq4uHDx4EFu3bsWWLVuwcuXKvq8FEYUf0Qft7e0iLS1NlJeXizvuuEMsW7ZMCCGEy+USkZGRoqysTOv74YcfCgDC4XAIIYTYtWuX0Ol0wul0an1KS0uF0WgUXq/3mpbvdrsFAOF2u/tSPhENkED2zT6NfAoKCpCTkwO73e43v6amBt3d3X7z09PTkZKSAofDAQBwOByYOHEiLBaL1ic7Oxsejwd1dXVXXZ7X64XH4/GbiCi8BXzAefv27Xjvvfdw+PDhK9qcTieioqJgNpv95lssFjidTq3PF4Ont7237WpKSkrwi1/8ItBSiSiEBTTyaW5uxrJly/Diiy8O6nUyxcXFcLvd2tTc3DxoyyaigRFQ+NTU1KC1tRW33HIL9Ho99Ho9KisrsX79euj1elgsFnR1dcHlcvm9r6WlBVarFQBgtVqvOPvV+7q3z+UMBgOMRqPfREThLaDwmTVrFmpra3H06FFtyszMRF5envbvyMhIVFRUaO9paGhAU1MTbDYbAMBms6G2thatra1an/LychiNRmRkZARptYgo1AV0zGf48OGYMGGC37zY2FgkJCRo8xcsWICioiLEx8fDaDRi6dKlsNlsmD59OgBg9uzZyMjIwPz587Fu3To4nU48+eSTKCgogMFgCNJqEVGoC/oVzs8++yx0Oh1yc3Ph9XqRnZ2NjRs3au0RERHYuXMnFi9eDJvNhtjYWOTn52P16tXBLoWIQtiQvbeLiAbfdf30CiIKDwwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikCCp+f//znUBTFb0pPT9faOzs7UVBQgISEBMTFxSE3N/eKRyM3NTUhJycHMTExSExMxIoVK9DT0xOctSGisBHwQwO/+c1v4s033/zHB+j/8RHLly/HG2+8gbKyMphMJixZsgRz587Fu+++CwDw+XzIycmB1WrFwYMHcebMGTz88MOIjIzEmjVrgrA6RBQ2RABWrVolJk+efNU2l8slIiMjRVlZmTbvww8/FACEw+EQQgixa9cuodPphNPp1PqUlpYKo9EovF7vNdfhdrsFAOF2uwMpn4gGWCD7ZsDHfE6cOIHRo0fjn/7pn5CXl4empiYAQE1NDbq7u2G327W+6enpSElJgcPhAAA4HA5MnDgRFotF65OdnQ2Px4O6urovXabX64XH4/GbiCi8BRQ+WVlZ2LJlC/bs2YPS0lI0Njbi9ttvR3t7O5xOJ6KiomA2m/3eY7FY4HQ6AQBOp9MveHrbe9u+TElJCUwmkzYlJycHUjYRhaCAjvncc8892r8nTZqErKwsjB07Fn/+858RHR0d9OJ6FRcXo6ioSHvt8XgYQERhrl+n2s1mM77xjW/g5MmTsFqt6Orqgsvl8uvT0tICq9UKALBarVec/ep93dvnagwGA4xGo99EROGtX+HT0dGBTz75BElJSZg6dSoiIyNRUVGhtTc0NKCpqQk2mw0AYLPZUFtbi9bWVq1PeXk5jEYjMjIy+lMKEYWZgL52/fSnP8W9996LsWPH4vTp01i1ahUiIiIwb948mEwmLFiwAEVFRYiPj4fRaMTSpUths9kwffp0AMDs2bORkZGB+fPnY926dXA6nXjyySdRUFAAg8EwICtIRKEpoPD561//innz5uH8+fMYNWoUZs6ciUOHDmHUqFEAgGeffRY6nQ65ubnwer3Izs7Gxo0btfdHRERg586dWLx4MWw2G2JjY5Gfn4/Vq1cHd62IKOQpQgghu4hAeTwemEwmuN1uHv8hCiGB7Ju8t4uIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUkRcPicOnUKDz30EBISEhAdHY2JEyfiyJEjWrsQAitXrkRSUhKio6Nht9tx4sQJv89oa2tDXl4ejEYjzGYzFixYgI6Ojv6vDRGFjYDC5/PPP8eMGTMQGRmJ3bt3o76+Hs888wxGjBih9Vm3bh3Wr1+PTZs2oaqqCrGxscjOzkZnZ6fWJy8vD3V1dSgvL8fOnTuxf/9+LFq0KHhrRUShTwTg8ccfFzNnzvzSdlVVhdVqFU8//bQ2z+VyCYPBILZt2yaEEKK+vl4AEIcPH9b67N69WyiKIk6dOnVNdbjdbgFAuN3uQMonogEWyL4Z0MjntddeQ2ZmJu6//34kJiZiypQpeP7557X2xsZGOJ1O2O12bZ7JZEJWVhYcDgcAwOFwwGw2IzMzU+tjt9uh0+lQVVV11eV6vV54PB6/iYjCW0Dh8+mnn6K0tBRpaWnYu3cvFi9ejMceewxbt24FADidTgCAxWLxe5/FYtHanE4nEhMT/dr1ej3i4+O1PpcrKSmByWTSpuTk5EDKJqIQpA+ks6qqyMzMxJo1awAAU6ZMwfHjx7Fp0ybk5+cPSIEAUFxcjKKiIu21x+MJywASQgAAzp49i7qaGlz8+0H20WPHIi0jA7GxsVAURWaJRIMmoPBJSkpCRkaG37zx48fj5ZdfBgBYrVYAQEtLC5KSkrQ+LS0tuPnmm7U+ra2tfp/R09ODtrY27f2XMxgMMBgMgZQacoQQOP7++9i/bRsu1NfDfOkS9H8Po091Orw5ahSSsrJwz0MPIX7kSIYQDXkBfe2aMWMGGhoa/OZ9/PHHGDt2LAAgNTUVVqsVFRUVWrvH40FVVRVsNhsAwGazweVyoaamRuuzb98+qKqKrKysPq9IKOvq6sLe//1fvLNqFRKOHUNqTw9GREZieFQUhkdFwarXY9znn0O88Qb+/MQTqD92TBslEQ1VAYXP8uXLcejQIaxZswYnT57ESy+9hOeeew4FBQUAAEVRUFhYiF/96ld47bXXUFtbi4cffhijR4/GnDlzAPxtpHT33Xdj4cKFqK6uxrvvvoslS5bgwQcfxOjRo4O+grKpqorXnn8ep/7wB4z0ehHxFSOaYRERGNHYiLdXrkTd++8zgGhICyh8pk2bhh07dmDbtm2YMGECfvnLX+I3v/kN8vLytD7/8R//gaVLl2LRokWYNm0aOjo6sGfPHgwbNkzr8+KLLyI9PR2zZs3Cd7/7XcycORPPPfdc8NYqRAgh8P7Bg2h54w0YIyKu6atUhE6HhM5OvFNaivPnzg1ClURyKCIM/3v1eDwwmUxwu90wGo2yy/lSLpcLLyxciBS3O+BjOF6fD+rs2Zi/YgWP/1DYCGTf5L1dA+ijDz5AzPnzfQoPQ0QEWqqr0d7ePgCVEcnH8BkgQggcf/ttmCMi+vwZ0W1taDh+PIhVEYUOhs8AUVUV5z/5BJG6vm/i4YqCv548GcSqiEIHw4eIpGD4DBBFURAVEwO1H8fzfUJgWExMEKsiCh0MnwGiKApumDkTF32+Pn9GW2QkJg7RCy+JGD4DRFEUTJ4xA+f7eMBZFQKRaWlIGoIXXhIBDJ8BNSY5GZZvfxudqhrQ+4QQ+DwiArc98AB0/ThgTRTK+Jc9gCIiIpD72GNoGzs2oGM/F1UV1vvvx9QZM3iBIQ1ZDJ8BFh0Tg/v+8z/hHDPma0dAQgicBxD5ne8ge948jnpoSONf9wBTFAXJqan4yfr1wLe/jWZVxcWeHr+RkE9V0dbdjaboaEwoLMSDP/0poqOjJVZNNPB4b9cg8vl8+OzTT3HswAE0vvsuuv/+Y2KmceMw/o478M2pUzFy1Ch+1aKwFci+yfCRQAgBn8+n/WRGREQEv2LRkBDIvhnQLxlScCiKAr2em56ub/zvloikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikiIsbzDqvSHT4/FIroSIvqh3n7yW+9XDMnzOnz8PAEhOTpZcCRFdTXt7O0wm01f2CcvwiY+PBwA0NTV97QoOZR6PB8nJyWhubg7LnxYJFm6HvwmF7SCEQHt7O0Zfw4MPwjJ8en/7xmQyXdd/bL2MRiO3A7gdesneDtc6IOABZyKSguFDRFKEZfgYDAasWrUKBoNBdilScTv8DbfD34TbdgjL33AmovAXliMfIgp/DB8ikoLhQ0RSMHyISIqwDJ8NGzZg3LhxGDZsGLKyslBdXS27pKApKSnBtGnTMHz4cCQmJmLOnDloaGjw69PZ2YmCggIkJCQgLi4Oubm5aGlp8evT1NSEnJwcxMTEIDExEStWrEBPT89grkpQrV27FoqioLCwUJt3vWyHU6dO4aGHHkJCQgKio6MxceJEHDlyRGsXQmDlypVISkpCdHQ07HY7Tpw44fcZbW1tyMvLg9FohNlsxoIFC9Dx9yfmSiPCzPbt20VUVJT44x//KOrq6sTChQuF2WwWLS0tsksLiuzsbLF582Zx/PhxcfToUfHd735XpKSkiI6ODq3Po48+KpKTk0VFRYU4cuSImD59urjtttu09p6eHjFhwgRht9vF+++/L3bt2iVGjhwpiouLZaxSv1VXV4tx48aJSZMmiWXLlmnzr4ft0NbWJsaOHSt+/OMfi6qqKvHpp5+KvXv3ipMnT2p91q5dK0wmk3jllVfEsWPHxPe//32RmpoqLl26pPW5++67xeTJk8WhQ4fEO++8I2688UYxb948GaukCbvwufXWW0VBQYH22ufzidGjR4uSkhKJVQ2c1tZWAUBUVlYKIYRwuVwiMjJSlJWVaX0+/PBDAUA4HA4hhBC7du0SOp1OOJ1OrU9paakwGo3C6/UO7gr0U3t7u0hLSxPl5eXijjvu0MLnetkOjz/+uJg5c+aXtquqKqxWq3j66ae1eS6XSxgMBrFt2zYhhBD19fUCgDh8+LDWZ/fu3UJRFHHq1KmBK/5rhNXXrq6uLtTU1MBut2vzdDod7HY7HA6HxMoGjtvtBvCPm2lramrQ3d3ttw3S09ORkpKibQOHw4GJEyfCYrFofbKzs+HxeFBXVzeI1fdfQUEBcnJy/NYXuH62w2uvvYbMzEzcf//9SExMxJQpU/D8889r7Y2NjXA6nX7bwWQyISsry287mM1mZGZman3sdjt0Oh2qqqoGb2UuE1bhc+7cOfh8Pr8/JgCwWCxwOp2Sqho4qqqisLAQM2bMwIQJEwAATqcTUVFRMJvNfn2/uA2cTudVt1FvW7jYvn073nvvPZSUlFzRdr1sh08//RSlpaVIS0vD3r17sXjxYjz22GPYunUrgH+sx1ftE06nE4mJiX7ter0e8fHxUrdDWN7Vfr0oKCjA8ePHceDAAdmlDLrm5mYsW7YM5eXlGDZsmOxypFFVFZmZmVizZg0AYMqUKTh+/Dg2bdqE/Px8ydX1T1iNfEaOHImIiIgrzmi0tLTAarVKqmpgLFmyBDt37sRbb72FMWPGaPOtViu6urrgcrn8+n9xG1it1qtuo962cFBTU4PW1lbccsst0Ov10Ov1qKysxPr166HX62GxWK6L7ZCUlISMjAy/eePHj0dTUxOAf6zHV+0TVqsVra2tfu09PT1oa2uTuh3CKnyioqIwdepUVFRUaPNUVUVFRQVsNpvEyoJHCIElS5Zgx44d2LdvH1JTU/3ap06disjISL9t0NDQgKamJm0b2Gw21NbW+v3BlZeXw2g0XvGHHKpmzZqF2tpaHD16VJsyMzORl5en/ft62A4zZsy44lKLjz/+GGPHjgUApKamwmq1+m0Hj8eDqqoqv+3gcrlQU1Oj9dm3bx9UVUVWVtYgrMWXkHaou4+2b98uDAaD2LJli6ivrxeLFi0SZrPZ74xGOFu8eLEwmUzi7bffFmfOnNGmixcvan0effRRkZKSIvbt2yeOHDkibDabsNlsWnvvKebZs2eLo0ePij179ohRo0aF1Snmq/ni2S4hro/tUF1dLfR6vfiv//ovceLECfHiiy+KmJgY8ac//Unrs3btWmE2m8Wrr74qPvjgA/GDH/zgqqfap0yZIqqqqsSBAwdEWloaT7X3xe9+9zuRkpIioqKixK233ioOHToku6SgAXDVafPmzVqfS5cuiX/7t38TI0aMEDExMeK+++4TZ86c8fuczz77TNxzzz0iOjpajBw5Uvz7v/+76O7uHuS1Ca7Lw+d62Q6vv/66mDBhgjAYDCI9PV0899xzfu2qqoqnnnpKWCwWYTAYxKxZs0RDQ4Nfn/Pnz4t58+aJuLg4YTQaxSOPPCLa29sHczWuwJ/UICIpwuqYDxENHQwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhIiv8PAS3LlPzClFgAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 300x300 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import gym\n",
    "\n",
    "\n",
    "#定义环境\n",
    "class MyWrapper(gym.Wrapper):\n",
    "\n",
    "    def __init__(self):\n",
    "        from pettingzoo.mpe import simple_tag_v3\n",
    "        env = simple_tag_v3.env(num_good=1,\n",
    "                                num_adversaries=1,\n",
    "                                num_obstacles=1,\n",
    "                                max_cycles=1e8,\n",
    "                                render_mode='rgb_array')\n",
    "        super().__init__(env)\n",
    "        self.env = env\n",
    "        self.step_n = 0\n",
    "\n",
    "    def reset(self):\n",
    "        self.env.reset()\n",
    "        self.step_n = 0\n",
    "        return self.state()\n",
    "\n",
    "    def state(self):\n",
    "        state = []\n",
    "        for i in self.env.agents:\n",
    "            state.append(env.observe(i).tolist())\n",
    "        state[-1].extend([0.0, 0.0])\n",
    "        return state\n",
    "\n",
    "    def step(self, action):\n",
    "        reward_sum = [0, 0]\n",
    "        for i in range(5):\n",
    "            if i != 0:\n",
    "                action = [-1, -1]\n",
    "            next_state, reward, over = self._step(action)\n",
    "            for j in range(2):\n",
    "                reward_sum[j] += reward[j]\n",
    "            self.step_n -= 1\n",
    "\n",
    "        self.step_n += 1\n",
    "\n",
    "        return next_state, reward_sum, over\n",
    "\n",
    "    def _step(self, action):\n",
    "        for i, _ in enumerate(env.agent_iter(2)):\n",
    "            self.env.step(action[i] + 1)\n",
    "\n",
    "        reward = [self.env.rewards[i] for i in self.env.agents]\n",
    "\n",
    "        _, _, termination, truncation, _ = env.last()\n",
    "        over = termination or truncation\n",
    "\n",
    "        #限制最大步数\n",
    "        self.step_n += 1\n",
    "        if self.step_n >= 100:\n",
    "            over = True\n",
    "\n",
    "        return self.state(), reward, over\n",
    "\n",
    "    #打印游戏图像\n",
    "    def show(self):\n",
    "        from matplotlib import pyplot as plt\n",
    "        plt.figure(figsize=(3, 3))\n",
    "        plt.imshow(self.env.render())\n",
    "        plt.show()\n",
    "\n",
    "\n",
    "env = MyWrapper()\n",
    "env.reset()\n",
    "\n",
    "env.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<__main__.A2C at 0x7f979661eca0>, <__main__.A2C at 0x7f97965b5880>]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "\n",
    "class A2C:\n",
    "\n",
    "    def __init__(self, model_actor, model_critic, model_critic_delay,\n",
    "                 optimizer_actor, optimizer_critic):\n",
    "        self.model_actor = model_actor\n",
    "        self.model_critic = model_critic\n",
    "        self.model_critic_delay = model_critic_delay\n",
    "        self.optimizer_actor = optimizer_actor\n",
    "        self.optimizer_critic = optimizer_critic\n",
    "\n",
    "        self.model_critic_delay.load_state_dict(self.model_critic.state_dict())\n",
    "        self.requires_grad(self.model_critic_delay, False)\n",
    "\n",
    "    def soft_update(self, _from, _to):\n",
    "        for _from, _to in zip(_from.parameters(), _to.parameters()):\n",
    "            value = _to.data * 0.99 + _from.data * 0.01\n",
    "            _to.data.copy_(value)\n",
    "\n",
    "    def requires_grad(self, model, value):\n",
    "        for param in model.parameters():\n",
    "            param.requires_grad_(value)\n",
    "\n",
    "    def train_critic(self, state, reward, next_state, over):\n",
    "        self.requires_grad(self.model_critic, True)\n",
    "        self.requires_grad(self.model_actor, False)\n",
    "\n",
    "        #计算values和targets\n",
    "        value = self.model_critic(state)\n",
    "\n",
    "        with torch.no_grad():\n",
    "            target = self.model_critic_delay(next_state)\n",
    "        target = target * 0.99 * (1 - over) + reward\n",
    "\n",
    "        #时序差分误差,也就是tdloss\n",
    "        loss = torch.nn.functional.mse_loss(value, target)\n",
    "\n",
    "        loss.backward()\n",
    "        self.optimizer_critic.step()\n",
    "        self.optimizer_critic.zero_grad()\n",
    "        self.soft_update(self.model_critic, self.model_critic_delay)\n",
    "\n",
    "        #减去value相当于去基线\n",
    "        return (target - value).detach()\n",
    "\n",
    "    def train_actor(self, state, action, value):\n",
    "        self.requires_grad(self.model_critic, False)\n",
    "        self.requires_grad(self.model_actor, True)\n",
    "\n",
    "        #重新计算动作的概率\n",
    "        prob = self.model_actor(state)\n",
    "        prob = prob.gather(dim=1, index=action)\n",
    "\n",
    "        #根据策略梯度算法的导函数实现\n",
    "        #函数中的Q(state,action),这里使用critic模型估算\n",
    "        prob = (prob + 1e-8).log() * value\n",
    "        loss = -prob.mean()\n",
    "\n",
    "        loss.backward()\n",
    "        self.optimizer_actor.step()\n",
    "        self.optimizer_actor.zero_grad()\n",
    "\n",
    "        return loss.item()\n",
    "\n",
    "\n",
    "model_actor = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 4),\n",
    "        torch.nn.Softmax(dim=1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "model_critic = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "model_critic_delay = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "optimizer_actor = [\n",
    "    torch.optim.Adam(model_actor[i].parameters(), lr=1e-3) for i in range(2)\n",
    "]\n",
    "\n",
    "optimizer_critic = [\n",
    "    torch.optim.Adam(model_critic[i].parameters(), lr=5e-3) for i in range(2)\n",
    "]\n",
    "\n",
    "a2c = [\n",
    "    A2C(model_actor[i], model_critic[i], model_critic_delay[i],\n",
    "        optimizer_actor[i], optimizer_critic[i]) for i in range(2)\n",
    "]\n",
    "\n",
    "model_actor = None\n",
    "model_critic = None\n",
    "model_critic_delay = None\n",
    "optimizer_actor = None\n",
    "optimizer_critic = None\n",
    "\n",
    "a2c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[30.0, -681.0654296875]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython import display\n",
    "import random\n",
    "\n",
    "\n",
    "#玩一局游戏并记录数据\n",
    "def play(show=False):\n",
    "    state = []\n",
    "    action = []\n",
    "    reward = []\n",
    "    next_state = []\n",
    "    over = []\n",
    "\n",
    "    s = env.reset()\n",
    "    o = False\n",
    "    while not o:\n",
    "        a = []\n",
    "        for i in range(2):\n",
    "            #计算动作\n",
    "            prob = a2c[i].model_actor(torch.FloatTensor(s[i]).reshape(\n",
    "                1, -1))[0].tolist()\n",
    "            a.append(random.choices(range(4), weights=prob, k=1)[0])\n",
    "\n",
    "        #执行动作\n",
    "        ns, r, o = env.step(a)\n",
    "\n",
    "        state.append(s)\n",
    "        action.append(a)\n",
    "        reward.append(r)\n",
    "        next_state.append(ns)\n",
    "        over.append(o)\n",
    "\n",
    "        s = ns\n",
    "\n",
    "        if show:\n",
    "            display.clear_output(wait=True)\n",
    "            env.show()\n",
    "\n",
    "    state = torch.FloatTensor(state)\n",
    "    action = torch.LongTensor(action).unsqueeze(-1)\n",
    "    reward = torch.FloatTensor(reward).unsqueeze(-1)\n",
    "    next_state = torch.FloatTensor(next_state)\n",
    "    over = torch.LongTensor(over).reshape(-1, 1)\n",
    "\n",
    "    return state, action, reward, next_state, over, reward.sum(\n",
    "        dim=0).flatten().tolist()\n",
    "\n",
    "\n",
    "state, action, reward, next_state, over, reward_sum = play()\n",
    "\n",
    "reward_sum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 -2.52531099319458 [6.0, -1572.4345703125]\n",
      "2500 -2.2057106494903564 [133.5, -160.2089385986328]\n",
      "5000 -0.4346385896205902 [68.5, -88.4798355102539]\n",
      "7500 -0.7878077626228333 [36.0, -57.25299072265625]\n",
      "10000 -0.4243254065513611 [20.0, -28.127361297607422]\n",
      "12500 -0.10560104995965958 [19.5, -21.97000503540039]\n",
      "15000 -0.20183932781219482 [15.5, -17.28420639038086]\n",
      "17500 -0.09082813560962677 [8.0, -17.061660766601562]\n",
      "20000 0.04816720634698868 [12.5, -17.330961227416992]\n",
      "22500 -0.03879740089178085 [8.0, -12.067501068115234]\n",
      "25000 -0.19893258810043335 [19.0, -19.506542205810547]\n",
      "27500 -0.038775209337472916 [14.0, -17.891183853149414]\n",
      "30000 0.018680671229958534 [16.5, -18.37790298461914]\n",
      "32500 -0.044296279549598694 [1.5, -9.976574897766113]\n",
      "35000 -0.007319218944758177 [11.5, -14.802574157714844]\n",
      "37500 0.001401672256179154 [5.0, -9.788602828979492]\n",
      "40000 -0.32673946022987366 [12.5, -14.883753776550293]\n",
      "42500 -0.012928042560815811 [8.5, -8.822985649108887]\n",
      "45000 0.08175945281982422 [10.5, -13.748283386230469]\n",
      "47500 -0.04752933979034424 [13.0, -21.088510513305664]\n"
     ]
    }
   ],
   "source": [
    "def train():\n",
    "    #训练N局\n",
    "    for epoch in range(5_0000):\n",
    "        state, action, reward, next_state, over, _ = play()\n",
    "\n",
    "        for i in range(2):\n",
    "            value = a2c[i].train_critic(state[:, i], reward[:, i],\n",
    "                                        next_state[:, i], over)\n",
    "            loss = a2c[i].train_actor(state[:, i], action[:, i], value)\n",
    "\n",
    "        if epoch % 2500 == 0:\n",
    "            test_result = [play()[-1] for _ in range(20)]\n",
    "            test_result = torch.FloatTensor(test_result).mean(dim=0).tolist()\n",
    "            print(epoch, loss, test_result)\n",
    "\n",
    "\n",
    "train()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAR8AAAEYCAYAAABlUvL1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/P9b71AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAj7UlEQVR4nO3df3RT9f0/8Gd+Nf1FElpo0n5poRudpVAQqJaIP6ZkVOw2Gd0vrMD4cOTI2ipUOa7nozCZWg5+Ns9wDqZT4RxxzJ5NGYwf64qWOUILddVSsKLCaQWSCrVJizRpet/fP1yvBgo0pe07kefjnHsOve9Xkte99D57c+9NrkYIIUBENMy0shsgomsTw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKSQFj7PPfccxo0bh+joaOTm5qK2tlZWK0QkgZTw+fOf/4zS0lKsXr0a77zzDqZMmYK8vDy0trbKaIeIJNDI+GBpbm4ubrjhBvzud78DACiKgtTUVJSUlOAXv/jFcLdDRBLoh/sF/X4/6urqUFZWps7TarVwOBxwOp19Psbn88Hn86k/K4qCtrY2JCYmQqPRDHnPRNQ/Qgh0dHQgJSUFWu3l31gNe/icOXMGPT09sFqtQfOtVivef//9Ph9TXl6Oxx9/fDjaI6JB0NLSgjFjxly2ZtjDZyDKyspQWlqq/uzxeJCWloaWlhaYTKYhe11FUfD0009j+/btV0xxWYQQGD16NP74xz8iISFBdjt0jfN6vUhNTcWIESOuWDvs4TNq1CjodDq43e6g+W63Gzabrc/HGI1GGI3Gi+abTKYhDZ/jx4/jrbfegl6vD+u3d2fOnMGbb76JxYsXy26FCAD6tb0M+5/zqKgoTJ8+HVVVVeo8RVFQVVUFu90+3O1c1r59+9DZ2RnWwQN8cczsn//8Z9BxMaJwJ+W9RGlpKV544QVs3rwZR48exbJly3Du3Lmw+svt9/uxd+9e6HQ62a1ckUajwbFjx3DixAnZrRD1m5RjPj/5yU/w6aefYtWqVXC5XLj++uuxe/fuiw5Cy9TS0oKmpqaw3+vp5ff7sW/fPlx33XWyWyHqF2kHnIuLi1FcXCzr5a/o6NGj+Pzzz2EwGGS30i9arRbvvPMOenp6ImJvjSg8T+GEgf3790Ovj4iTgarGxkZ0dHTIboOoXxg+fVAUBZ2dnbLbCIlGo0EgEEBXV5fsVoj6heHTh3PnzqGhoUF2GyHr7OzE4cOHZbdB1C8Mn0tQFCViDjZ/laIoslsg6heGDxFJwfAhIikYPpeg0WgQibexj8S3inRtYvj0ITY2FhMnTpTdRsji4uKQlZUluw2ifmH49EGn0/XrU7nhRAgBvV6PuLg42a0Q9QvD5xJmzJiBnp4e2W2EZMKECYiPj5fdBlG/MHwuYcKECX1+jUe4UhQFU6ZMibirsunaxfC5hHHjxiEjIyNiDjobDAbcdtttstsg6jeGzyUYjUZ8+9vfjoiL9oQQSE9Px/jx42W3QtRvDJ/LuPXWWxEdHR32ez+KouD2229HdHS07FaI+o3hcxnjx4/HrFmzwjp8hBAYNWoU5s2bJ7sVopAwfC5Dp9Nh4cKFMBqNYRtAiqKgoKAASUlJslshCgnD5wrGjx+PhQsXhuWxHyEEMjMzMX/+fF7ZTBGH4XMFOp0O9957LyZOnBhWez9CCBgMBjz88MOwWCyy2yEKGcOnH0wmE37xi18gISEhbAJIo9Fg6dKlmDp1quxWiAaE4dNPkyZNwn333Qe9Xi89gIQQsNvtuOeee/h9zRSxGD79pNFo8MMf/hAlJSVSA6g3eJ544gnExMRI6YFoMDB8QqDT6XDPPfegpKQEOp1u2ANIURTY7XY8+eSTPM5DES/k8Nm3bx++973vISUlBRqNBm+88UbQuBACq1atQnJyMmJiYuBwOHDs2LGgmra2NhQWFsJkMsFisWDJkiUR84XtvQH0xBNPYPTo0cNyFkwIAY1GgwULFjB46Gsj5PA5d+4cpkyZgueee67P8XXr1mH9+vXYuHEjampqEBcXh7y8vKC7KhQWFqKxsRGVlZXYsWMH9u3bh6VLlw58KYaZTqfD7Nmz8eyzz8Jutw/pHpAQAqNHj8aaNWvw4IMPMnjoa0MjrmLL0Wg0eP311zF37lwAX2woKSkpeOihh/Dwww8DADweD6xWKzZt2oSf/vSnOHr0KLKysnDw4EHk5OQAAHbv3o277roLn3zyCVJSUq74ul6vF2azGR6PByaTaaDtD4rz58/jL3/5CzZt2oSzZ89Co9EMyjU3iqLAYDDg9ttvx4oVK2C1WnktD4W9ULbNQT3mc/z4cbhcLjgcDnWe2WxGbm4unE4nAMDpdMJisajBAwAOhwNarRY1NTV9Pq/P54PX6w2awkVMTAwKCwvx6quvYunSpUhMTISiKFAUJaQ9IiEEhBAIBAKIiorCrFmz8OKLL+Kpp56CzWZj8NDXzqB++YvL5QKAi+65brVa1TGXy3XRRwH0ej0SEhLUmguVl5fj8ccfH8xWB5VGo0FSUhLuv/9+3HPPPaivr0dlZSVqamrw2Wefwe/3Q6vVQqu9OOsDgQC0Wq36FagOhwMzZ85EcnIyT6PT11pEfPNUWVkZSktL1Z+9Xi9SU1MldtQ3jUYDs9mM2267Dbfeeiu8Xi+am5tx4sQJHD9+HE1NTUH1ZrMZM2bMwIgRI5CdnQ2LxcIvA6NrxqD+pttsNgCA2+1GcnKyOt/tduP6669Xa1pbW4MeFwgE0NbWpj7+QkajMaK+VRD4Moiys7ORnZ0tux2isDOox3zS09Nhs9lQVVWlzvN6vaipqYHdbgcA2O12tLe3o66uTq3Zu3cvFEVBbm7uYLZDRGEs5D2fzs5OfPjhh+rPx48fR319PRISEpCWlobly5fjiSeeQEZGBtLT0/HYY48hJSVFPSM2YcIE3HnnnbjvvvuwceNGdHd3o7i4GD/96U/7daaLiL4mRIjefPNNAeCiadGiRUIIIRRFEY899piwWq3CaDSKWbNmiaampqDnOHv2rJg/f76Ij48XJpNJLF68WHR0dPS7B4/HIwAIj8cTavtENIRC2Tav6jofWcLpOh8i+pK063yIiPqL4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkRUjhU15ejhtuuAEjRoxAUlIS5s6de9H9x7u6ulBUVITExETEx8ejoKAAbrc7qKa5uRn5+fmIjY1FUlISVq5ciUAgcPVLQ0QRI6Twqa6uRlFREQ4cOIDKykp0d3dj9uzZOHfunFqzYsUKbN++HRUVFaiursapU6cwb948dbynpwf5+fnw+/3Yv38/Nm/ejE2bNmHVqlWDt1REFP6u5u6Era2tAoCorq4WQgjR3t4uDAaDqKioUGuOHj0qAAin0ymEEGLnzp1Cq9UKl8ul1mzYsEGYTCbh8/n69bq8YylReApl27yqYz4ejwcAkJCQAACoq6tDd3c3HA6HWpOZmYm0tDQ4nU4AgNPpRHZ2NqxWq1qTl5cHr9eLxsbGPl/H5/PB6/UGTUQU2QYcPoqiYPny5Zg5cyYmTZoEAHC5XIiKioLFYgmqtVqtcLlcas1Xg6d3vHesL+Xl5TCbzeqUmpo60LaJKEwMOHyKiopw+PBhbN26dTD76VNZWRk8Ho86tbS0DPlrEtHQ0g/kQcXFxdixYwf27duHMWPGqPNtNhv8fj/a29uD9n7cbjdsNptaU1tbG/R8vWfDemsuZDQaYTQaB9IqEYWpkPZ8hBAoLi7G66+/jr179yI9PT1ofPr06TAYDKiqqlLnNTU1obm5GXa7HQBgt9vR0NCA1tZWtaayshImkwlZWVlXsyxEFEFC2vMpKirCq6++im3btmHEiBHqMRqz2YyYmBiYzWYsWbIEpaWlSEhIgMlkQklJCex2O2bMmAEAmD17NrKysrBgwQKsW7cOLpcLjz76KIqKirh3Q3QtCeU0GoA+p5dfflmtOX/+vPj5z38uRo4cKWJjY8UPfvADcfr06aDnOXHihJgzZ46IiYkRo0aNEg899JDo7u7udx881U4UnkLZNjVCCCEv+gbG6/XCbDbD4/HAZDLJboeI/iuUbZOf7SIiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFKEFD4bNmzA5MmTYTKZYDKZYLfbsWvXLnW8q6sLRUVFSExMRHx8PAoKCtRbIfdqbm5Gfn4+YmNjkZSUhJUrVyIQCAzO0hBRxAgpfMaMGYO1a9eirq4Ohw4dwh133IG7774bjY2NAIAVK1Zg+/btqKioQHV1NU6dOoV58+apj+/p6UF+fj78fj/279+PzZs3Y9OmTVi1atXgLhURhb+rvUPhyJEjxR//+EfR3t4uDAaDqKioUMeOHj0qAAin0ymEEGLnzp1Cq9UKl8ul1mzYsEGYTCbh8/n6/Zq8YylReApl2xzwMZ+enh5s3boV586dg91uR11dHbq7u+FwONSazMxMpKWlwel0AgCcTieys7NhtVrVmry8PHi9XnXvqS8+nw9erzdoIqLIFnL4NDQ0ID4+HkajEffffz9ef/11ZGVlweVyISoqChaLJajearXC5XIBAFwuV1Dw9I73jl1KeXk5zGazOqWmpobaNhGFmZDD57rrrkN9fT1qamqwbNkyLFq0CEeOHBmK3lRlZWXweDzq1NLSMqSvR0RDTx/qA6KiojB+/HgAwPTp03Hw4EH89re/xU9+8hP4/X60t7cH7f243W7YbDYAgM1mQ21tbdDz9Z4N663pi9FohNFoDLVVIgpjV32dj6Io8Pl8mD59OgwGA6qqqtSxpqYmNDc3w263AwDsdjsaGhrQ2tqq1lRWVsJkMiErK+tqWyGiCBLSnk9ZWRnmzJmDtLQ0dHR04NVXX8Vbb72FPXv2wGw2Y8mSJSgtLUVCQgJMJhNKSkpgt9sxY8YMAMDs2bORlZWFBQsWYN26dXC5XHj00UdRVFTEPRuia0xI4dPa2oqFCxfi9OnTMJvNmDx5Mvbs2YPvfOc7AIBnnnkGWq0WBQUF8Pl8yMvLw+9//3v18TqdDjt27MCyZctgt9sRFxeHRYsWYc2aNYO7VEQU9jRCCCG7iVB5vV6YzWZ4PB6YTCbZ7RDRf4WybfKzXUQkBcOHiKRg+BCRFCFf50Mkm6IoUBQFAKDVaqHV8m9oJGL4UMRQFAUtJ1vw2luvof5kPQAgOzkbP/72jzEudRxDKMIwfCgiBAIBbNm9BX9u+jP8Vj+039BCo9Ggyl+F6r9Uo+CbBVh01yIYDAbZrVI/8U8FhT0hBCoqK7DFvQWB1AB0Rh00Gg0AQBulRc+YHrx29jVs2bUFEXjlyDWL4UNh78zZM6horABGQA2dr9JoNEA88Ndjf4XLfelvR6DwwvChsLd933Z4Rnn6DJ5eGo0G50afwxvVbwxfY3RVGD4U9j5yfQRdrO6KdbpoHT5yfzQMHdFgYPhQ2NNqtUA/D+XwjFfk4P8Uhb3bJ9+OnrM9V6wLfBbA7dm3D0NHNBgYPhT2bpp2E77p/yaEcundHyEE0s6l4bYbbhvGzuhqMHwo7EVHR6P4zmKMODsCSo9y0bjoEYg7E4eS2SWIjY2V0CENBMOHIkL2ddlY41iD7LPZUFwKAl0BBLoC6HH3YMKZCXj89scxbeI02W1SCPh9PhRRAoEA6hvr8e4H70JAIHt8NqZNmsYrm8NEKNsmP15BEUWv1yNnSg5ypuTIboWuEt92EZEU3POhqyKEwOeff44P338f3vZ2AEDCqFFIz8jgwV+6LIYPDdiplhZUVVTAVVsLY2sronq+uBbHp9ej22bD/5sxA7N+/GMkXXCXWiKA4UMDoCgK6mtr4Xz2WZjdbozVaqExGICvHPQVn34K37Zt+Mt//oNbSkow8frrL/vZLLr28JgPhezfe/bAuXo1Rn36KYw63SU/aR6t02HkJ5/grf/9Xxz6178kdErh7KrCZ+3atdBoNFi+fLk6r6urC0VFRUhMTER8fDwKCgrUWyL3am5uRn5+PmJjY5GUlISVK1ciEAhcTSs0TJo//hj/eekljBYC2n7syeg0GiQFAjjw/PNwnTw5DB1SpBhw+Bw8eBB/+MMfMHny5KD5K1aswPbt21FRUYHq6mqcOnUK8+bNU8d7enqQn58Pv9+P/fv3Y/Pmzdi0aRNWrVo18KWgYeH3+7HtN7+B1XP5r7e4kEajwSi3G9t+9zv+kSHVgMKns7MThYWFeOGFFzBy5Eh1vsfjwYsvvojf/OY3uOOOOzB9+nS8/PLL2L9/Pw4cOAAA+Mc//oEjR47glVdewfXXX485c+bgV7/6FZ577jn4/f7BWSoaEidbWuBvaurXHs+FdBoNPO++izOffjoEnVEkGlD4FBUVIT8/Hw6HI2h+XV0duru7g+ZnZmYiLS0NTqcTAOB0OpGdnQ3rV86A5OXlwev1orGxsc/X8/l88Hq9QRMNv/f270dCz5U/Xd4XjUaDkefP4/DBg4PcFUWqkMNn69ateOedd1BeXn7RmMvlQlRUFCwWS9B8q9UKl8ul1lgvOPXa+3NvzYXKy8thNpvVKTU1NdS2aRA0NzQgTj/wE6QjDAacaGgYxI4okoUUPi0tLXjwwQexZcsWREdHD1VPFykrK4PH41GnlpaWYXttIhoaIYVPXV0dWltbMW3aNOj1euj1elRXV2P9+vXQ6/WwWq3w+/1o/++Vrr3cbjdsNhsAwGazXXT2q/fn3poLGY1GmEymoImGn8FoxMVfaNF/ihAwGI2D1g9FtpDCZ9asWWhoaEB9fb065eTkoLCwUP23wWBAVVWV+pimpiY0NzfDbrcDAOx2OxoaGtDa2qrWVFZWwmQyISsra5AWi4ZC1q23or27e8CPPxsIIPuWWwaxI4pkIb2BHzFiBCZNmhQ0Ly4uDomJier8JUuWoLS0FAkJCTCZTCgpKYHdbseMGTMAALNnz0ZWVhYWLFiAdevWweVy4dFHH0VRURGM/KsY1iZNmwZnXBwS/P6Qr1YWQuDzxER8a+LEIeqOIs2gX+H8zDPP4Lvf/S4KCgpw6623wmaz4a9//as6rtPpsGPHDuh0Otjtdtx7771YuHAh1qxZM9it0CCzjByJcbNn47wQId2cTwiBc0Ig87vfRVxc3BB2SJGEXyZGIenq6sJLjzyChKNHoevn3k9AUdB5441YtHo1oqKihrhDkimUbZOf7aKQREdHY+5DD6F1zBicVy5/+FkIgU5FQVtGBuY+8ACDh4IwfChkKWlp+J/166H/zndwsqcH53t6gt6GCSHweSCAk0LANHcuFv/f/2E0v1aDLsC3XTRggUAAx95/H+/u24dPDhyA/7+XWBgTEzH2ppsw+ZZb8M2MDOh0V77bKH09hLJtMnzoqgkh4PP50PPfj17o9XqeubxG8QvkaVhpNJphveKdvh54zIeIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikCCl8fvnLX0Kj0QRNmZmZ6nhXVxeKioqQmJiI+Ph4FBQUXHRr5ObmZuTn5yM2NhZJSUlYuXIlAoHA4CwNEUWMkL9GdeLEifjnP//55RPov3yKFStW4O9//zsqKipgNptRXFyMefPm4d///jcAoKenB/n5+bDZbNi/fz9Onz6NhQsXwmAw4KmnnhqExSGiiCFCsHr1ajFlypQ+x9rb24XBYBAVFRXqvKNHjwoAwul0CiGE2Llzp9BqtcLlcqk1GzZsECaTSfh8vn734fF4BADh8XhCaZ+Ihlgo22bIx3yOHTuGlJQUfOMb30BhYSGam5sBAHV1deju7obD4VBrMzMzkZaWBqfTCQBwOp3Izs6G9Sv3cMrLy4PX60VjY+MlX9Pn88Hr9QZNRBTZQgqf3NxcbNq0Cbt378aGDRtw/Phx3HLLLejo6IDL5UJUVBQsFkvQY6xWK1wuFwDA5XIFBU/veO/YpZSXl8NsNqtTampqKG0TURgK6ZjPnDlz1H9PnjwZubm5GDt2LF577TXExMQMenO9ysrKUFpaqv7s9XoZQEQR7qpOtVssFnzrW9/Chx9+CJvNBr/fj/b/3rWyl9vths1mAwDYbLaLzn71/txb0xej0QiTyRQ0EVFku6rw6ezsxEcffYTk5GRMnz4dBoMBVVVV6nhTUxOam5tht9sBAHa7HQ0NDWhtbVVrKisrYTKZkJWVdTWtEFGECelt18MPP4zvfe97GDt2LE6dOoXVq1dDp9Nh/vz5MJvNWLJkCUpLS5GQkACTyYSSkhLY7XbMmDEDADB79mxkZWVhwYIFWLduHVwuFx599FEUFRXx9rpE15iQwueTTz7B/PnzcfbsWYwePRo333wzDhw4gNGjRwMAnnnmGWi1WhQUFMDn8yEvLw+///3v1cfrdDrs2LEDy5Ytg91uR1xcHBYtWoQ1a9YM7lIRUdjTCCGE7CZCFcrN6Ilo+ISybfKzXUQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISIqQw+fkyZO49957kZiYiJiYGGRnZ+PQoUPquBACq1atQnJyMmJiYuBwOHDs2LGg52hra0NhYSFMJhMsFguWLFmCzs7Oq18aIooYIYXPZ599hpkzZ8JgMGDXrl04cuQIfv3rX2PkyJFqzbp167B+/Xps3LgRNTU1iIuLQ15eHrq6utSawsJCNDY2orKyEjt27MC+ffuwdOnSwVsqIgp/IgSPPPKIuPnmmy85riiKsNls4umnn1bntbe3C6PRKP70pz8JIYQ4cuSIACAOHjyo1uzatUtoNBpx8uTJfvXh8XgEAOHxeEJpn4iGWCjbZkh7Pn/729+Qk5ODH/3oR0hKSsLUqVPxwgsvqOPHjx+Hy+WCw+FQ55nNZuTm5sLpdAIAnE4nLBYLcnJy1BqHwwGtVouampo+X9fn88Hr9QZNRBTZQgqfjz/+GBs2bEBGRgb27NmDZcuW4YEHHsDmzZsBAC6XCwBgtVqDHme1WtUxl8uFpKSkoHG9Xo+EhAS15kLl5eUwm83qlJqaGkrbRBSGQgofRVEwbdo0PPXUU5g6dSqWLl2K++67Dxs3bhyq/gAAZWVl8Hg86tTS0jKkr0dEQy+k8ElOTkZWVlbQvAkTJqC5uRkAYLPZAAButzuoxu12q2M2mw2tra1B44FAAG1tbWrNhYxGI0wmU9BERJEtpPCZOXMmmpqaguZ98MEHGDt2LAAgPT0dNpsNVVVV6rjX60VNTQ3sdjsAwG63o729HXV1dWrN3r17oSgKcnNzB7wgRBRhQjmSXVtbK/R6vXjyySfFsWPHxJYtW0RsbKx45ZVX1Jq1a9cKi8Uitm3bJt577z1x9913i/T0dHH+/Hm15s477xRTp04VNTU14u233xYZGRli/vz5Q3JEnYiGTyjbZkjhI4QQ27dvF5MmTRJGo1FkZmaK559/PmhcURTx2GOPCavVKoxGo5g1a5ZoamoKqjl79qyYP3++iI+PFyaTSSxevFh0dHT0uweGD1F4CmXb1AghhNx9r9B5vV6YzWZ4PB4e/yEKI6Fsm/xsFxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJoZfdwED03mTV6/VK7oSIvqp3m+zPjZAjMnzOnj0LAEhNTZXcCRH1paOjA2az+bI1ERk+CQkJAIDm5uYrLuDXmdfrRWpqKlpaWq7pe9ZzPXwhHNaDEAIdHR1ISUm5Ym1Eho9W+8WhKrPZfE3/svUymUxcD+B66CV7PfR3h4AHnIlICoYPEUkRkeFjNBqxevVqGI1G2a1IxfXwBa6HL0TaetCI/pwTIyIaZBG550NEkY/hQ0RSMHyISAqGDxFJEZHh89xzz2HcuHGIjo5Gbm4uamtrZbc0aMrLy3HDDTdgxIgRSEpKwty5c9HU1BRU09XVhaKiIiQmJiI+Ph4FBQVwu91BNc3NzcjPz0dsbCySkpKwcuVKBAKB4VyUQbV27VpoNBosX75cnXetrIeTJ0/i3nvvRWJiImJiYpCdnY1Dhw6p40IIrFq1CsnJyYiJiYHD4cCxY8eCnqOtrQ2FhYUwmUywWCxYsmQJOjs7h3tRgokIs3XrVhEVFSVeeukl0djYKO677z5hsViE2+2W3dqgyMvLEy+//LI4fPiwqK+vF3fddZdIS0sTnZ2das39998vUlNTRVVVlTh06JCYMWOGuOmmm9TxQCAgJk2aJBwOh/jPf/4jdu7cKUaNGiXKyspkLNJVq62tFePGjROTJ08WDz74oDr/WlgPbW1tYuzYseJnP/uZqKmpER9//LHYs2eP+PDDD9WatWvXCrPZLN544w3x7rvviu9///siPT1dnD9/Xq258847xZQpU8SBAwfEv/71LzF+/Hgxf/58GYukirjwufHGG0VRUZH6c09Pj0hJSRHl5eUSuxo6ra2tAoCorq4WQgjR3t4uDAaDqKioUGuOHj0qAAin0ymEEGLnzp1Cq9UKl8ul1mzYsEGYTCbh8/mGdwGuUkdHh8jIyBCVlZXitttuU8PnWlkPjzzyiLj55psvOa4oirDZbOLpp59W57W3twuj0Sj+9Kc/CSGEOHLkiAAgDh48qNbs2rVLaDQacfLkyaFr/goi6m2X3+9HXV0dHA6HOk+r1cLhcMDpdErsbOh4PB4AX36Ytq6uDt3d3UHrIDMzE2lpaeo6cDqdyM7OhtVqVWvy8vLg9XrR2Ng4jN1fvaKiIuTn5wctL3DtrIe//e1vyMnJwY9+9CMkJSVh6tSpeOGFF9Tx48ePw+VyBa0Hs9mM3NzcoPVgsViQk5Oj1jgcDmi1WtTU1AzfwlwgosLnzJkz6OnpCfplAgCr1QqXyyWpq6GjKAqWL1+OmTNnYtKkSQAAl8uFqKgoWCyWoNqvrgOXy9XnOuodixRbt27FO++8g/Ly8ovGrpX18PHHH2PDhg3IyMjAnj17sGzZMjzwwAPYvHkzgC+X43LbhMvlQlJSUtC4Xq9HQkKC1PUQkZ9qv1YUFRXh8OHDePvtt2W3MuxaWlrw4IMPorKyEtHR0bLbkUZRFOTk5OCpp54CAEydOhWHDx/Gxo0bsWjRIsndXZ2I2vMZNWoUdDrdRWc03G43bDabpK6GRnFxMXbs2IE333wTY8aMUefbbDb4/X60t7cH1X91Hdhstj7XUe9YJKirq0NrayumTZsGvV4PvV6P6upqrF+/Hnq9Hlar9ZpYD8nJycjKygqaN2HCBDQ3NwP4cjkut03YbDa0trYGjQcCAbS1tUldDxEVPlFRUZg+fTqqqqrUeYqioKqqCna7XWJng0cIgeLiYrz++uvYu3cv0tPTg8anT58Og8EQtA6amprQ3NysrgO73Y6GhoagX7jKykqYTKaLfpHD1axZs9DQ0ID6+np1ysnJQWFhofrva2E9zJw586JLLT744AOMHTsWAJCeng6bzRa0HrxeL2pqaoLWQ3t7O+rq6tSavXv3QlEU5ObmDsNSXIK0Q90DtHXrVmE0GsWmTZvEkSNHxNKlS4XFYgk6oxHJli1bJsxms3jrrbfE6dOn1enzzz9Xa+6//36RlpYm9u7dKw4dOiTsdruw2+3qeO8p5tmzZ4v6+nqxe/duMXr06Ig6xdyXr57tEuLaWA+1tbVCr9eLJ598Uhw7dkxs2bJFxMbGildeeUWtWbt2rbBYLGLbtm3ivffeE3fffXefp9qnTp0qampqxNtvvy0yMjJ4qn0gnn32WZGWliaioqLEjTfeKA4cOCC7pUEDoM/p5ZdfVmvOnz8vfv7zn4uRI0eK2NhY8YMf/ECcPn066HlOnDgh5syZI2JiYsSoUaPEQw89JLq7u4d5aQbXheFzrayH7du3i0mTJgmj0SgyMzPF888/HzSuKIp47LHHhNVqFUajUcyaNUs0NTUF1Zw9e1bMnz9fxMfHC5PJJBYvXiw6OjqGczEuwq/UICIpIuqYDxF9fTB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKf4/FV2u996v3AQAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 300x300 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "[0.0, 0.0]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "play(True)[-1]"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "第9章-策略梯度算法.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python [conda env:cuda117]",
   "language": "python",
   "name": "conda-env-cuda117-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
